We thank the reviewers for their suggestions. Closely following the techniques used in Tucker et al. (2017) and Grathwohl et al. (2018), RELAX requires gradients from a (learned) surrogate function, whereas DisARM evaluates only the parts of the model selected by the discrete gates. The authors of ARM released an extension, ARSM (Yin et al. 2019), for categorical variables; however, this would require extending DisARM to the categorical case.
Figure: ELBO on the training set (left), the 100-sample bound on the test set (middle), and the variance of the gradient estimator (right).
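For readers unfamiliar with the estimators mentioned above: ARM (Yin and Zhou, 2019) reduces the variance of Bernoulli gradient estimates by coupling each uniform draw u with its antithetic counterpart 1 - u. Below is a minimal, self-contained sketch of the single-variable ARM estimator; the function names are our own illustrative choices, not the authors' released code:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def arm_grad(f, phi, n_samples=100_000, seed=0):
    """ARM estimate of d/dphi E[f(b)] for b ~ Bernoulli(sigmoid(phi)).

    Each uniform draw u is implicitly paired with its antithetic
    counterpart 1 - u, producing the two coupled Bernoulli samples below.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        u = rng.random()
        b1 = 1.0 if u > sigmoid(-phi) else 0.0  # sample under the antithetic draw 1 - u
        b2 = 1.0 if u < sigmoid(phi) else 0.0   # sample under the original draw u
        total += (f(b1) - f(b2)) * (u - 0.5)
    return total / n_samples
```

Because the two coupled evaluations share the same u, the estimator is unbiased yet typically far lower-variance than the plain score-function (REINFORCE) estimator; its expectation matches the exact gradient sigmoid'(phi) * (f(1) - f(0)).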
Accelerating Stochastic Gradient Descent Using Antithetic Sampling
However, the high variance introduced by the stochastic gradient at each step may slow down convergence. In this paper, we propose an antithetic sampling strategy that reduces this variance by exploiting the internal structure of the dataset. Under this strategy, the stochastic gradients within a mini-batch are no longer independent but are made as negatively correlated as possible, while the mini-batch stochastic gradient remains an unbiased estimator of the full gradient. For binary classification problems, the antithetic samples can be computed once in advance and reused at every iteration, which makes the method practical. Experiments confirm the effectiveness of the proposed method.
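The core idea, negatively correlated draws that leave the estimator unbiased while lowering its variance, is the classical antithetic-variates trick. A minimal, self-contained sketch (a generic Monte Carlo illustration, not the authors' dataset-specific construction) that pairs each uniform draw u with 1 - u when estimating E[f(U)]:

```python
import random
import statistics

def mc_estimates(f, n_pairs, seed=0, antithetic=False):
    """Per-pair Monte Carlo estimates of E[f(U)], U ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_pairs):
        u = rng.random()
        # The antithetic partner 1 - u is perfectly negatively correlated with u.
        v = (1.0 - u) if antithetic else rng.random()
        estimates.append(0.5 * (f(u) + f(v)))
    return estimates

f = lambda u: math_e ** u if False else 2.718281828 ** u  # monotone integrand; true mean is e - 1
indep = mc_estimates(f, 10_000)
anti = mc_estimates(f, 10_000, antithetic=True)
# Both estimators are unbiased, but for a monotone f the antithetic
# pairs cancel much of the noise, so variance(anti) << variance(indep).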
Antithetic and Monte Carlo kernel estimators for partial rankings
Lomeli, Maria, Rowland, Mark, Gretton, Arthur, Ghahramani, Zoubin
Abstract: In the modern age, rankings data are ubiquitous and useful for a variety of applications such as recommender systems, multi-object tracking and preference learning. However, most rankings data encountered in the real world are incomplete, which precludes the direct application of existing modelling tools for complete rankings. Our contribution is a novel way to extend kernel methods for complete rankings to partial rankings, via consistent Monte Carlo estimators of Gram matrices. These Monte Carlo kernel estimators are based on extending kernel mean embeddings to the embedding of the set of full rankings consistent with an observed partial ranking. They form a computationally tractable alternative to previous approaches for partial rankings data. We also present a novel variance reduction scheme, based on an antithetic variate construction between permutations, to obtain an improved estimator. An overview of existing kernels and metrics for permutations is also provided.
Keywords: Reproducing Kernel Hilbert Space; partial rankings; Monte Carlo; antithetic variates; Gram matrix.
1 Motivation: Permutations play a fundamental role in statistical modelling and machine learning applications involving rankings.
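The estimator described in the abstract can be sketched as follows: average a kernel on full rankings (here an exponentiated Kendall distance, often called a Mallows kernel) over sampled completions of each partial ranking. The top-k partial-ranking format and function names are our own illustrative assumptions, not the paper's API:

```python
import itertools
import math
import random

def kendall_distance(p, q):
    """Number of discordant item pairs between two full rankings (tuples of items in rank order)."""
    n = len(p)
    pos_q = {item: i for i, item in enumerate(q)}
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if pos_q[p[i]] > pos_q[p[j]])

def mallows_kernel(p, q, lam=0.5):
    """Positive-definite kernel on full rankings via the Kendall distance."""
    return math.exp(-lam * kendall_distance(p, q))

def consistent_full_rankings(top, items):
    """All full rankings consistent with an observed top-k partial ranking."""
    rest = [x for x in items if x not in top]
    return [tuple(top) + perm for perm in itertools.permutations(rest)]

def mc_kernel(partial_a, partial_b, items, n_samples=200, seed=0):
    """Monte Carlo estimate of the kernel mean embedding inner product:
    the kernel averaged over sampled completions of each partial ranking."""
    rng = random.Random(seed)
    ca = consistent_full_rankings(partial_a, items)
    cb = consistent_full_rankings(partial_b, items)
    return sum(mallows_kernel(rng.choice(ca), rng.choice(cb))
               for _ in range(n_samples)) / n_samples
```

In practice the set of consistent completions is too large to enumerate, so one samples completions directly; the paper's antithetic scheme would additionally pair each sampled completion with a negatively correlated counterpart, which this sketch omits.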